Tuning Methods in Statistical Machine Translation
Author
Abstract
In a Statistical Machine Translation (SMT) system, many models, called features, complement each other in producing natural language translations. How much we rely on a given feature is governed by parameters, or weights. Learning these weights is the subfield of SMT called parameter tuning, which this thesis addresses. We compare three existing methods for learning such parameters, recasting MERT, MIRA and Downhill Simplex in a uniform framework to allow easy and consistent comparison. Based on our findings and the resulting opportunities for improvement, we introduce two new methods. The first is a straightforward sampling approach, Local Unimodal Sampling (LUS), which samples uniformly from a shrinking region around a continually updated peak in weight-vector space. The second is a ranking-based approach, implementing SVM-Rank, which aims to give a high score not only to the best translation but also to its runners-up. We empirically compare our methods to the existing ones and find that LUS slightly, but significantly, outperforms the state-of-the-art MERT method in a realistic setting with 14 features. We claim that this improvement, the simplicity of the radically different approach that achieves it, and the clear overview of existing work are contributions to the field. Our SVM-Rank implementation showed no improvement over the state of the art within our experimental setup.
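The LUS idea described in the abstract (uniform sampling from a shrinking region around a continually updated peak) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `shrink` factor, the bounds and the toy objective are all assumptions made for illustration. In a real tuning run, `score` would decode a tuning set with the candidate weights and return a translation-quality metric.

```python
import random


def lus_maximize(score, dim, lower=-1.0, upper=1.0,
                 shrink=0.9, iterations=500, seed=0):
    """Sketch of Local Unimodal Sampling for maximising `score`.

    All parameter names and defaults are hypothetical; the shrink
    factor in particular is an assumption, not taken from the thesis.
    """
    rng = random.Random(seed)
    # Start from a random weight vector inside the initial bounds.
    best = [rng.uniform(lower, upper) for _ in range(dim)]
    best_score = score(best)
    # The sampling region initially spans the whole search range.
    radius = upper - lower
    for _ in range(iterations):
        # Sample uniformly from a box centred on the current peak.
        candidate = [w + rng.uniform(-radius, radius) for w in best]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            # Improvement: move the peak, keep the region size.
            best, best_score = candidate, candidate_score
        else:
            # No improvement: shrink the sampling region.
            radius *= shrink
    return best, best_score


def toy_score(weights):
    """Stand-in objective (assumption): negative squared distance to a
    fixed target, so the maximum is 0 at the target weights."""
    target = [0.5, -0.2, 0.1]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


weights, value = lus_maximize(toy_score, dim=3)
```

Because a candidate is accepted only when it scores higher, the returned score can never be worse than that of the starting point. The method needs no gradients or line searches, which is what makes it simple compared with MERT.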
Similar resources
Search-Aware Tuning for Machine Translation
Parameter tuning is an important problem in statistical machine translation, but surprisingly, most existing methods such as MERT, MIRA and PRO are agnostic about search, while search errors could severely degrade translation quality. We propose a search-aware framework to promote promising partial translations, preventing them from being pruned. To do so we develop two metrics to evaluate parti...
Tuning Statistical Machine Translation Parameters
Word alignment is the basis of statistical machine translation. GIZA++ is a popular tool for producing word alignments and translation models. It uses a set of parameters that affect the quality of word alignments and translation models. These parameters exist to overcome problems such as overfitting. This paper addresses the problem of tuning GIZA++ parameters for better translation qualit...
Phrasal: A Toolkit for New Directions in Statistical Machine Translation
We present a new version of Phrasal, an open-source toolkit for statistical phrase-based machine translation. This revision includes features that support emerging research trends such as (a) tuning with large feature sets, (b) tuning on large datasets like the bitext, and (c) web-based interactive machine translation. A direct comparison with Moses shows favorable results in terms of decoding s...
Tuning machine translation parameters with SPSA
Most statistical machine translation systems are combinations of various models, and tuning the scaling factors is an important step. However, this optimisation problem is hard because the objective function has many local minima and the available algorithms cannot achieve a global optimum. Consequently, optimisations starting from different initial settings can converge to fairly differe...
The Effect of Translationese on Tuning for Statistical Machine Translation
We explore how the translation direction in the tuning set used for statistical machine translation affects the translation results. We explore this issue for three language pairs. While the results on different metrics are somewhat conflicting, using tuning data translated in the same direction as the translation systems tends to give the best length ratio and Meteor scores for all language pa...
Drem: The AFRL Submission to the WMT15 Tuning Task
We define a new algorithm, named “Drem”, for tuning the weighted linear model in a statistical machine translation system. Drem has two major innovations. First, it uses scaled derivative-free trust-region optimization rather than other methods’ line search or (sub)gradient approximations. Second, it interpolates the decoder output, using information about which decodes produced which translati...